Abstract

In this work, we describe the sequencing and annotation of the genome of the yeast strain ISA1307,
isolated from a sparkling wine continuous production plant. This strain, formerly considered to belong
to the Zygosaccharomyces bailii species, has been used to study Z. bailii physiology, in particular its
extreme tolerance to acetic acid stress at low pH. The analysis of the genome sequence described in this
work indicates that strain ISA1307 is an interspecies hybrid between Z. bailii and a closely related
species. The genome sequence of ISA1307 is distributed through 154 scaffolds and has a size of around
21.2 Mb, corresponding to 96% of the genome size estimated by flow cytometry. Annotation of the ISA1307
genome includes 4385 duplicated genes (~90% of the total number of predicted genes) and 1155 predicted
single-copy genes. The functional categories including the highest number of genes are 'Metabolism and
generation of energy', 'Protein folding, modification and targeting' and 'Biogenesis of cellular
components'. Knowledge of the genome sequence of the ISA1307 strain is expected to accelerate
systems-level understanding of stress resistance mechanisms in Z. bailii and to inspire and guide novel
biotechnological applications of this yeast species/strain in fermentation processes, given its high
resilience to acidic stress. The availability of the ISA1307 genome sequence also paves the way to a
better understanding of the genetic mechanisms underlying the generation and selection of more robust
hybrid yeast strains in the stressful environment of wine fermentations.
Key words: Zygosaccharomyces bailii; hybrid yeast strains; weak acid food preservatives tolerance; wine yeast 
strains; genome sequencing and annotation 



1. Introduction

Among food spoilage yeasts, those belonging to the genus Zygosaccharomyces are considered the most
problematic to the food and beverage industries, with the Z. bailii species representing the most
significant spoilage yeast within the genus, especially in acidic food products. Despite the progress
achieved in product formulation and in the control and development of improved sanitation technologies,
Z. bailii is still a major spoilage threat in mayonnaise, salad dressings, sauces, pickled or brined
vegetables, fruit concentrates and various non-carbonated fruit drinks, as well as other acidified
foods. Zygosaccharomyces bailii is also a significant spoiler of wines. The success of Z. bailii as a
spoilage yeast results from a number of physiological traits of the species, in particular its
remarkable resilience against weak acids used as food preservatives, such as acetic, benzoic, propionic
and sorbic acids, and against sulphur dioxide, being able to proliferate in the presence of
concentrations that are frequently above the values permitted by some food legislation.
Zygosaccharomyces bailii is also able to tolerate high concentrations of ethanol and other sanitizers
and to grow in a wide range of pH (2.0-7.0) and water activities (0.80-0.99). Zygosaccharomyces bailii
is known to vigorously ferment hexoses and, like other members of the Zygosaccharomyces genus, exhibits
a fructophilic behaviour, metabolizing fructose at a higher rate than glucose when the two carbon
sources are present in the growth medium. Moreover, Z. bailii is able to cause spoilage from an
extremely low inoculum, to tolerate moderate osmotic pressure and to grow at high growth rates under
oxygen-restrictive conditions. Food products that are preserved at low pH, low water activity or low
oxygen concentration, and that contain adequate amounts of fermentable sugars, are therefore at
particular risk of spoilage by this yeast, causing significant economic losses for the industries that
produce and commercialize these products. Zygosaccharomyces bailii is also frequently isolated in wine
fermentations and, although this is generally considered detrimental, potential beneficial effects have
also been proposed. This yeast species is a potential new host for biotechnological processes. In
particular, it is an attractive candidate to allow fermentation processes to be performed under
otherwise restrictive conditions, or to be used in heterologous protein and metabolite production, due
to its high resilience to a number of environmental stresses, high specific growth rate and high biomass
yield. Z. bailii has already been used successfully for the production of lactic acid, L-ascorbic acid
(vitamin C) and vitamin B12.

Unlike in Saccharomyces cerevisiae, the exploitation of omics strategies in Zygosaccharomyces yeasts has
been severely limited by the absence of available genome sequences for species of this genus. The genome
of Z. rouxii CBS732, completed in 2009, was the first genome sequence of this genus to be disclosed, and
only very recently was the genome sequence of the Z. bailii type strain CLIB213T (=ATCC58445) released.
Therefore, until now, most of the studies dedicated to Z. bailii have relied on gene-by-gene approaches.
A quantitative proteomic analysis, based on quantitative two-dimensional gel electrophoresis (2-DE), was
however recently performed to elucidate the mechanisms underlying the adaptive response and intrinsic
high tolerance of Z. bailii cells to sub-lethal concentrations of acetic acid. A coordinated increase in
the content of proteins involved in carbohydrate metabolism and energy generation, as well as in general
and oxidative stress response, was registered. The results reinforced a previously established concept
that glucose and acetic acid are co-consumed in Z. bailii, with acetate being channelled into the
tricarboxylic acid cycle. When acetic acid is the sole carbon source, the results suggest the activation
of the gluconeogenic and pentose phosphate pathways, based on the increased content of several proteins
of these pathways after glucose exhaustion. The lack of a genome sequence for Z. bailii limited this
expression proteomic analysis, given that only 40% of the differentially expressed proteins could be
identified by peptide mass fingerprinting. The development of molecular biology tools for Z. bailii,
such as the isolation of stable auxotrophic mutants and the release of a set of vectors allowing ectopic
gene expression, is also relatively recent.

In this article, we describe the sequencing and annotation of strain ISA1307, isolated from a continuous
sparkling wine production plant. We also provide evidence supporting the notion that this strain,
formerly considered to belong to the Z. bailii species, is an interspecies hybrid between Z. bailii and
another closely related yeast species. The phylogenetic relationships of a large cohort of isolates
first classified as Z. bailii were recently re-examined, and significant differences in their rRNA gene
sequences and genome fingerprinting patterns were found, leading to the distribution of these isolates
into three species: Z. bailii, Z. parabailii and Z. pseudobailii. Despite the differences registered at
the molecular level, the Z. bailii species could not be distinguished from the other two novel species
using physiological tests. The occurrence in wines of natural hybrid strains generated by hybridization
of different Saccharomyces species is widely described in the literature, the lager brewing yeast
Saccharomyces pastorianus being the most paradigmatic example. The occurrence of hybrid strains within
the Zygosaccharomyces genus involving, at least, the Z. rouxii, Z. pseudorouxii and Z. mellis species
was also reported. The ISA1307 strain, the focus of our work, has been used in several studies conducted
to examine different aspects of Z. bailii physiology, in particular its extreme tolerance to acetic acid
(minimum inhibitory concentration for acetic acid in the range of 270-420 mM compared with 80 mM for
S. cerevisiae, and our unpublished results), metabolism of fructose and glucose, and growth under
oxygen-restrictive conditions. A genomic library from strain ISA1307 was constructed and successfully
used for functional analysis of several relevant genes. Considering that the formation of hybrids in the
stressful environment of wine fermentations has been associated with improved strain robustness, it is
expected that the sequencing and annotation of the ISA1307 genome reported in this work may be used to
inspire and guide novel biotechnological applications of this strain and of the Z. bailii species under
otherwise restrictive process conditions. The availability of the ISA1307 genome sequence will also open
the door to a better understanding of the genetic mechanisms underlying the generation of hybrid strains
in the stressful environment of wine fermentations.

2. Materials and methods 

2.1. Strains and growth medium

The prototrophic yeast isolates ISA1307 and Z. bailii ATCC58445T (=CLIB213T), and the laboratory strains
S. cerevisiae BY4741 (genotype MATa; his3Δ1; leu2Δ0; lys2Δ0; ura3Δ0) and S. cerevisiae BY4743 (genotype
MATa/α his3Δ1/his3Δ1 leu2Δ0/leu2Δ0 LYS2/lys2Δ0 met15Δ0/MET15 ura3Δ0/ura3Δ0), acquired from the Euroscarf
collection, were used in this study. Strains were maintained and cultivated in rich YPD growth medium,
which contains 2% glucose (Merck), 2% yeast extract (Difco) and 2% peptone (Difco).

2.2. Quantification of ISA1307 and S. cerevisiae
total genomic DNA by flow cytometry
Quantification of total genomic DNA from S. cerevisiae (strains BY4741 and BY4743) and from the hybrid
strain ISA1307 was performed using a SYBR Green I-based staining protocol, as described before. Briefly,
cells batch cultured in YPD growth medium at 26°C until mid-exponential phase (OD600nm of 1.0 ± 0.01;
10^ cells for each species) were harvested by centrifugation, washed with H2O and fixed overnight in
0.5 ml of 70% ethanol (vol/vol). Fixed cells were collected by centrifugation, washed with 50 mM sodium
citrate buffer (pH 7.5) and resuspended in 750 µL of this same buffer supplemented with 1 mg of RNase A.
After 1 h of incubation at 50°C, 1 mg of proteinase K was added to the cell suspension and the mixture
was left at 50°C for another hour. Cells were subsequently stained using 20 µL of SYBR Green I working
solution (corresponding to a 500-fold dilution of the commercial solution). Samples were sonicated at
low power and analysed in an Epics® XL™ (Beckman Coulter) flow cytometer equipped with an argon ion
laser emitting a 488-nm beam at 15 mW. The green fluorescence was collected through a 488-nm blocking
filter, a 550-nm long-pass dichroic and a 525-nm bandpass filter. Thirty thousand cells per sample were
analysed to obtain the cell cycle profiles shown in Fig. 1. The mean fluorescence intensities obtained
for S. cerevisiae BY4741 and BY4743 were used to build a calibration curve from which the size of the
genome of the ISA1307 strain was estimated.
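
The calibration step can be illustrated with a minimal sketch of a two-point linear calibration. The fluorescence values below are hypothetical arbitrary units, chosen only so that the sample peak is 1.7-fold above the haploid reference and 1.1-fold below the diploid reference, as reported in Section 3.2; the diploid genome size is assumed here to be twice the 12.16 Mb haploid size. The function names are illustrative and do not come from the original protocol.

```python
# Two-point linear calibration of genome size against G0/G1 mean fluorescence.
HAPLOID_SIZE_MB = 12.16                  # S. cerevisiae BY4741, value quoted in the text
DIPLOID_SIZE_MB = 2 * HAPLOID_SIZE_MB    # assumption: BY4743 holds twice the haploid DNA


def calibrate(mfi_haploid, mfi_diploid):
    """Return (slope, intercept) of the line through the two reference points."""
    slope = (DIPLOID_SIZE_MB - HAPLOID_SIZE_MB) / (mfi_diploid - mfi_haploid)
    intercept = HAPLOID_SIZE_MB - slope * mfi_haploid
    return slope, intercept


def estimate_genome_size(mfi_sample, mfi_haploid, mfi_diploid):
    """Interpolate the genome size (Mb) of a sample from its G0/G1 fluorescence."""
    slope, intercept = calibrate(mfi_haploid, mfi_diploid)
    return slope * mfi_sample + intercept


# Hypothetical fluorescence intensities (arbitrary units), not measured data.
MFI_BY4741 = 100.0
MFI_BY4743 = 187.0
MFI_ISA1307 = 170.0

size_mb = estimate_genome_size(MFI_ISA1307, MFI_BY4741, MFI_BY4743)
print(f"Estimated ISA1307 genome size: {size_mb:.1f} Mb")
```

With these illustrative inputs the estimate is about 21.9 Mb, close to the ~22.0 Mb reported in the text. A fit over more than two reference strains would need a least-squares regression instead.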

2.3. Karyotyping of the ISA1307 strain

Intact DNA for pulsed-field gel electrophoresis (PFGE) was prepared in plugs as previously described.
ISA1307 and Z. bailii ATCC58445 cells, cultivated overnight at 26°C in YPD growth medium, were harvested
by centrifugation, washed twice with 0.05 M EDTA, pH 8.0, and resuspended at a concentration of
1.2 x 10^ cells/ml in 0.05 M EDTA containing 3 mg/ml of Zymolyase 100T for digestion. Plugs were formed
by mixing the suspension of cells with the same volume of 2% low-melting agarose (SeaPlaque; Cambrex Bio
Science, Rockland, ME, USA) at 40°C. Plugs were then incubated overnight in 0.45 mM EDTA, pH 8.0, and
7.5% (vol/vol) 2-mercaptoethanol at 37°C. After this incubation step, plugs were washed three times in
Tris/EDTA (TE) buffer (10 mM Tris, pH 8.0, and 1 mM EDTA, pH 8.0) and incubated overnight in 0.5 M EDTA,
10 mM Tris, pH 8.0, 1 mg/mL of proteinase K (Sigma-Aldrich) and 1% sodium-N-lauryl sarcosinate at 50°C.
After washing five times, for 30 min each, with TE, pH 8.0, at room temperature, samples were stored at
4°C. PFGE was performed in a CHEF-DRII Chiller System (Bio-Rad, Hercules, CA, USA). PFGE gels were run
in 0.5× Tris-borate-EDTA buffer at 12°C with an angle of 120°, a voltage of 3 V/cm and switch times of
300 s for 120 h.

2.4. Genome sequencing, assembly and annotation
The genome of the ISA1307 hybrid strain was sequenced at CD Genomics (New York, USA) using a
whole-genome shotgun approach based on paired-end Illumina sequencing. Details on the methods used for
genome sequencing, assembly and subsequent annotation are described in Supplementary Material.

3. Results and discussion 

3.1. The ISA1307 strain is an interspecies hybrid
between Z. bailii and a closely related species
Following the analysis of the ISA1307 strain genome sequence described below, and given that yeast
isolates formerly identified as Z. bailii were recently reclassified into the Z. bailii, Z. parabailii
and Z. pseudobailii species, we have examined the taxonomic classification of this strain. The sequences
of the housekeeping genes RPB1, RPB2, EF1-α and β-tubulin were compared. These gene sequences were
proposed to have a very high capacity to discriminate between the Z. bailii, Z. parabailii and
Z. pseudobailii species. Only one copy of the RPB1 gene was found in the ISA1307 genome,
Figure 1. Estimation of genome size and karyotyping of the ISA1307 strain. (A) Representative cell cycle
analysis histogram of S. cerevisiae BY4741 or BY4743 (in black) and ISA1307 (in grey). ISA1307 and
S. cerevisiae cells were cultivated in YPD growth medium until stationary phase and then labelled with
SYBR Green I to stain genomic DNA. Mean fluorescence intensities (MFI) of the G0/G1 peaks of the cell
cycle histogram were estimated by flow cytometry. The MFI values obtained for the two S. cerevisiae
strains were used to build a calibration curve that was used to calculate the size of the genome of the
ISA1307 strain (12.16 Mb for the size of the genome of the haploid strain S. cerevisiae BY4741).
(B) Karyotype of the reference strain Z. bailii ATCC58445 (lane 2) and of the ISA1307 strain (lane 3).
Total genomic DNA of both yeast species cultivated in YPD growth medium until stationary phase was
separated by PFGE. The size of the ISA1307 high-molecular-weight chromosomes was estimated based on the
high-molecular-weight standard [Hansenula wingei (Bio-Rad), lane 1], while the size of the
low-molecular-weight chromosomes was estimated using the sizes of S. cerevisiae chromosomes (not shown).



this being identical to the corresponding orthologue annotated in the genome of several Z. bailii
strains (Supplementary Material). The β-tubulin, RPB2 and EF1-α genes are duplicated in the genome of
the ISA1307 strain, with one allele being almost identical (>99% identity at the nucleotide level) to
the corresponding orthologue found in Z. bailii strains and the other allele being identical to the
orthologues found in Z. parabailii strains (Supplementary Material). The analysis of the genome sequence
revealed that this allelic divergence is registered in ~90% of the genes found to be duplicated in the
ISA1307 genome (see below). Altogether, these results strongly suggest that the ISA1307 strain is an
interspecies hybrid between Z. bailii and a closely related species. The results obtained for the
sequences of the β-tubulin, EF1-α and RPB2 genes appear to suggest that Z. parabailii could be the other
parental species. However, a closer inspection of the sequences of these genes deposited for the
different strains classified as Z. parabailii by Suh et al. showed the existence of multiple ambiguous
positions, which suggests that these sequences may themselves have been obtained by amplification of
divergent alleles. Thus, we hypothesize that the strains previously classified as Z. parabailii could be
hybrid strains. This hypothesis is in line with the reported inability of Z. parabailii ATCC56075
(=NCYC128) to undergo meiotic sporulation, a phenotypic trait common in hybrid strains and also
described for ISA1307.



3.2. Karyotyping and estimation of total DNA
content of the ISA1307 strain
To estimate the size of the ISA1307 genome, exponential cells were fixed and DNA was quantified by flow
cytometry using the fluorescent probe SYBR Green I. Cell cycle analysis revealed that the intensity of
the G0/G1 peak exhibited by ISA1307 cells is 1.7-fold higher and 1.1-fold lower than the values
registered for the reference strains S. cerevisiae BY4741 (haploid) and BY4743 (diploid), respectively
(Fig. 1A). Considering that the S. cerevisiae BY4741 genome has a size of 12.16 Mb
(www.yeastgenome.org), the estimated size of ISA1307 total DNA is ~22.0 Mb (Table 1). To complement this
analysis,
Table 1. Genome assembly statistics of the Z. bailii-derived interspecies hybrid strain ISA1307

Total reads                   120 000 000
No. of scaffolds              154
Coverage                      600x
N50 (bp)                      232 974
Maximum contig length (bp)    806 952
Minimum contig length (bp)    2160
Average contig length (bp)    137 280
Assembly size (bp)            21 141 152

The most significant parameters associated with assembly of the reads that were obtained after
sequencing of the ISA1307 genome are indicated.



PFGE was used to separate ISA1307 genomic DNA. Under the experimental conditions used, 13 chromosomal
bands were observed, with sizes ranging from 733 to 2120 kb (Fig. 1B). PFGE profiling of the type strain
Z. bailii ATCC58445 (=CLIB213T) was also performed, and five chromosomal bands were observed (Fig. 1B).
This result is in line with a previous publication suggesting that the ISA1307 strain has at least three
more chromosomes than the Z. bailii type strain. The sum of the PFGE bands is ~19 Mb, differing by 3 Mb
from the total amount of DNA estimated by flow cytometry, a gap that can result from co-migration of
chromosomal bands in the PFGE gel. In fact, it is possible that the first two bands (2120 and 1981 kb)
and possibly the last band (730 kb) are duplicated, based on their higher intensity compared with the
other bands observed in the gel (Fig. 1).

3.3. Assembly of ISA1307 genome

Two rounds of paired-end Illumina sequencing (inserts of ~350 bp, 100-base reads) were carried out to
obtain the sequence of the ISA1307 genome. Around 120 Gb of reads were acquired, yielding a genome
coverage of 600-fold. The de novo assembly of the reads was carried out using the SOAPdenovo assembler,
resulting in 190 scaffolds. After the assembly process, the sum of the scaffold sizes obtained (21.1 Mb)
was well above the size expected for a haploid genome (which would be ~11 Mb), indicating that the
duplicated sequences from the homeologous chromosomes (homologous chromosomes acquired from two
different species) of the ISA1307 strain were not merged into a single consensus sequence. The same had
also been observed during genome sequencing of other interspecies hybrid strains, such as S. pastorianus
or Pichia sorbitophila, this being attributed to the different origin of the homeologous chromosomes
that compose the genome of hybrid strains. To reconstruct the genome sequence of the ISA1307 strain, we
used an approach similar to the one used to assemble the genome of other hybrid yeast strains. Briefly,
190 scaffolds with homologous genes were detected (using an all-against-all BLASTP analysis) and then
sequentially ordered based on the search for syntenic blocks with the genomes of Z. rouxii CBS732 and
S. cerevisiae S288c. These yeast species were selected for this analysis since they are phylogenetically
close to Z. bailii and their genomes are well annotated and available in public databases (Génolevures
database and Saccharomyces Genome Database or CYGD, respectively). The junction points between scaffolds
predicted to be contiguous by our synteny-based in silico analysis were tested by PCR to confirm correct
scaffold positioning, and the existing gaps were closed by sequencing the amplification product. The
genome assembly statistics are summarized in Table 1. The final reconstructed genomic sequence of the
ISA1307 strain is distributed over 154 scaffolds with sizes ranging from 2160 to 806 952 bp. The sum of
all scaffold sizes is 21 141 152 bp (Table 1), which corresponds to 96% of the genome size that was
estimated by flow cytometry (see above). The sequence of the genome of the ISA1307 strain and the
subsequent annotation were deposited in the European Nucleotide Archive (ENA,
http://www.ebi.ac.uk/ena/data/view/CBTC010000001-CBTC010000154). Although a genome sequence for the type
strain Z. bailii CLIB213T has been recently published, it was only released after the assembly of the
ISA1307 genome. A comparative genomic analysis between the genomes of ISA1307 and Z. bailii CLIB213T
(discussed below) suggests that the genomes of the two parental species are interspersed in the genome
of the hybrid strain ISA1307, which shows that the use of the Z. bailii CLIB213T genome as a reference
for the assembly of the ISA1307 genome would have been disadvantageous compared with the strategy that
we used, which was based on the S. cerevisiae and Z. rouxii genomes.
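
The assembly statistics quoted above can be cross-checked with a short sketch. It assumes that the "Total reads" entry of Table 1 counts individual 100-base reads; the `n50` helper and the toy scaffold lengths are illustrative only and are not part of the original pipeline.

```python
def n50(lengths):
    """Length L such that scaffolds of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one scaffold length")


# Values quoted in the text and in Table 1.
TOTAL_READS = 120_000_000
READ_LENGTH_BASES = 100             # assumption: every read counted is 100 bases long
ASSEMBLY_SIZE_BP = 21_141_152
N_SCAFFOLDS = 154
FLOW_CYTOMETRY_SIZE_BP = 22.0e6     # genome size estimated by flow cytometry

coverage = TOTAL_READS * READ_LENGTH_BASES / ASSEMBLY_SIZE_BP
mean_scaffold_bp = ASSEMBLY_SIZE_BP / N_SCAFFOLDS
fraction_of_estimate = ASSEMBLY_SIZE_BP / FLOW_CYTOMETRY_SIZE_BP

print(f"coverage: ~{coverage:.0f}-fold")
print(f"mean scaffold length: {mean_scaffold_bp:.0f} bp")
print(f"assembly / flow cytometry estimate: {100 * fraction_of_estimate:.0f}%")

# Toy scaffold lengths (hypothetical) to show how N50 is obtained.
toy_lengths = [10, 8, 6, 4, 2]
print("N50 of toy assembly:", n50(toy_lengths))
```

Under that assumption the coverage comes out at roughly 570-fold, on the order of the 600-fold reported, while the mean scaffold length (137 280 bp) and the 96% fraction agree with the figures in the text.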

3.4. Annotation and structure of ISA1307 genome

To annotate protein-encoding genes in the genome sequence of the ISA1307 strain, a combination of ab
initio and homology methods was applied, using the gene structure of S. cerevisiae S288c and Z. rouxii
CBS732 genes as references. In total, 9925 genes are predicted to be encoded by the genome of the hybrid
ISA1307 strain, 90% of these being considered duplicated genes (corresponding to 4385 gene pairs) since
the encoded proteins share >50% Simap similarity at the amino acid level (listed in Supplementary
Table S1). The number of genes predicted to be encoded by the genome of this strain is around twice the
number of genes annotated for the type strain Z. bailii CLIB213T. The scaffolds encoding homologous
genes were indicated by the suffixes 'A' and 'B' in the scaffold names to reflect the existence of two
orthologous sets of scaffold sequences. Sixteen scaffolds lacking clear orthologous sequences remained
and were maintained as singletons labelled with the 's' suffix. A MUMmer alignment of the A and B
scaffolds indicates a base identity of 92.6% [for a total of 89.5% (A) and 94.0% (B) of aligned bases],
consistent with the proposed hybrid nature of the ISA1307 genome.
In line with this difference, variation in the sequence of the ISA1307 duplicated genes was also
observed (Supplementary Table S2). For 90% of the ISA1307 duplicated genes, it was found that one of the
alleles was almost identical to the corresponding gene found in Z. bailii CLIB213T (99-100% identity at
the nucleotide level), while the sequence of the other allele was less similar (94-98% identity at the
nucleotide level) (Supplementary Table S2). Notably, the ISA1307 gene alleles presumed to have
originated from Z. bailii (that is, those identical to genes found in the CLIB213T strain) were
distributed between A and B scaffolds (Supplementary Table S2), indicating that the genetic information
coming from this species is, apparently, not confined to only one of the homeologous chromosomes of the
ISA1307 strain, probably due to the occurrence of chromosomal rearrangements after hybridization of the
parental strains. The differences between the two alleles of ISA1307 duplicated genes registered at the
nucleotide level had almost no impact on the sequence of the encoded proteins, since only six gene pairs
(ZBAI_07571/ZBAI_01790; ZBAI_06324/ZBAI_01930; ZBAI_09856/ZBAI_05001; ZBAI_07267/ZBAI_01173;
ZBAI_08269/ZBAI_03260; ZBAI_08169/ZBAI_01798) exhibited a ratio of non-synonymous (dN) to synonymous
(dS) substitution rates above 1.
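
The allele comparison described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes two pre-aligned nucleotide sequences of equal length, ignores gap columns (one of several possible conventions), and uses the 99-100% and 94-98% identity ranges quoted in the text as labels. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gap columns
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def classify_allele(identity_to_type_strain):
    """Label an allele by its nucleotide identity to the Z. bailii CLIB213T gene."""
    if identity_to_type_strain >= 99.0:
        return "Z. bailii-like"   # 99-100% range quoted in the text
    if 94.0 <= identity_to_type_strain <= 98.0:
        return "divergent"        # 94-98% range quoted in the text
    return "unassigned"           # outside both quoted ranges


# Hypothetical toy alignment, far shorter than a real gene.
print(percent_identity("ACGT", "ACGA"))   # 3 of 4 columns match
print(classify_allele(99.5), classify_allele(96.0))
```

Real comparisons would start from a proper pairwise alignment of full-length coding sequences; the thresholds here only restate the ranges reported for the duplicated genes.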

The general features of the ISA1307 genome, in particular gene density, average GC content, number of
tRNAs and number of rRNA loci, are consistent with those described for other hemiascomycetous yeasts, in
particular for S. cerevisiae S288c and Z. rouxii CBS732 (Table 2). The average length of all genes is
1471 bp and the incidence of introns is ~3%, in line with the results obtained for S. cerevisiae S288c
and Z. rouxii CBS732 (Table 2). Around 97% (9631 genes) of the genes are predicted to be intron-free.
The remaining genes are predicted to have two (277 genes) or three or more exons (17 genes), similarly
to the S. cerevisiae S288c and Z. rouxii CBS732 genes (results not shown). No significant differences
were registered in the structure of the genes located in A and B scaffolds (results not shown), which is
compatible with the anticipated genetic relatedness of the parental species that originated the ISA1307
strain.
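
The gene counts given in this section can be checked against one another with simple arithmetic. The sketch below uses only numbers quoted in the text (9925 predicted genes, 4385 duplicated gene pairs, and the exon-count classes); the variable names are illustrative.

```python
# Counts quoted in Section 3.4.
TOTAL_GENES = 9925
DUPLICATED_PAIRS = 4385
INTRON_FREE = 9631
TWO_EXONS = 277
THREE_OR_MORE_EXONS = 17

duplicated_genes = 2 * DUPLICATED_PAIRS              # both members of each pair
single_copy_genes = TOTAL_GENES - duplicated_genes
duplicated_fraction = duplicated_genes / TOTAL_GENES

with_introns = TWO_EXONS + THREE_OR_MORE_EXONS
intron_fraction = with_introns / TOTAL_GENES
intron_free_fraction = INTRON_FREE / TOTAL_GENES

print(f"duplicated genes: {duplicated_genes} ({100 * duplicated_fraction:.1f}%)")
print(f"single-copy genes: {single_copy_genes}")
print(f"genes with introns: {with_introns} ({100 * intron_fraction:.1f}%)")
print(f"intron-free genes: {100 * intron_free_fraction:.1f}%")
```

The exon classes add up to the 9925 predicted genes, the single-copy count matches the 1155 genes given in the abstract, and the duplicated fraction is about 88%, which the text rounds to ~90%.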

The sequence and annotation of the genome of the ISA1307 strain disclosed in this study are accessible
at http://pedant.helmholtz-muenchen.de/genomes.jsp?Category=fungal, including browsing by a GBrowse
instance. To allow comparative navigation through the genome of the hybrid strain and the genomes of
Z. rouxii CBS732, S. cerevisiae S288c and Z. bailii CLIB213T, a GBrowse_syn instance is accessible at
http://mips.helmholtz-muenchen.de/gbrowse2/cgi-bin/gbrowse_syn/zbailii. Massive genomic rearrangements
seem to have occurred since the divergence of Z. bailii, Z. rouxii and the ISA1307 strain from
S. cerevisiae, because the genetic information contained in the chromosomes of the budding yeast is
dispersed throughout the Z. rouxii CBS732 chromosomes and throughout the scaffolds of the genomes of
ISA1307 and Z. bailii CLIB213T (Fig. 2A). The genomes of ISA1307, Z. bailii CLIB213T and Z. rouxii
CBS732 are more syntenic, reflecting the close phylogenetic distance between these strains; however, the
existence of large gaps is still evident (Fig. 2B). In general, a high degree of synteny was observed
between the two homeologous scaffolds of the ISA1307 strain and scaffolds of Z. bailii CLIB213T
(Fig. 2C and D), consistent with Z. bailii being


Table 2. General features of ISA1307, Z. rouxii CBS732 and S. cerevisiae S288c genomes

Strain               No. of       Ploidy  Genome     Average GC   Total no.  Genome-coding  Average GC  Average CDS  % CDS with
                     chromosomes          size (Mb)  content (%)  of CDS     coverage (%)   in CDS (%)  length (bp)  introns
ISA1307              13           (~2n)   22         42.4         9931       69.8           43.8        1471         3
Z. rouxii CBS732     7            n       12.3       39.1         4992       76.1           40.2        1491         3-6
S. cerevisiae S288c  16           n       12.3       38.3         5769       70.0           40.3        1464         4.5

For the ISA1307 strain genome, each parameter indicated in the table was calculated from the final
reconstructed genomic sequence after annotation. ISA1307 genome size was calculated based on the results
obtained by flow cytometry shown in Fig. 1A. Average gene density represents the fraction of each genome
occupied by protein-coding genes (other genetic elements were not considered). Information for the
Z. rouxii CBS732 and S. cerevisiae S288c genomes was taken from the cited reference. CDS, coding
sequences. The size of the chromosomes was estimated based on the results obtained in the PFGE shown in
Fig. 1.
Figure 2. Multigenome alignment of genomic regions of S. cerevisiae S288c, Z. rouxii CBS732, Z. bailii
CLIB213T and the interspecies hybrid strain ISA1307. Alignments of Z. bailii CLIB213T, S. cerevisiae
S288c, Z. rouxii CBS732 and the hybrid strain ISA1307 centred on different genomic regions are shown.
Each coloured square represents a different scaffold found in the genomes of Z. bailii CLIB213T or
ISA1307, or a chromosome of S. cerevisiae S288c or Z. rouxii CBS732. Conserved synteny blocks are shown
in shaded boxes. This image was obtained using the multigenome alignment tool GBrowse_syn
(http://mips.helmholtz-muenchen.de/gbrowse2/cgi-bin/gbrowse_syn/zbailii).




one of the parental species of the ISA1307 strain and with the hypothesis that the other parental strain
is phylogenetically close to Z. bailii.

Twelve putative centromere-like sequences were found in the 154 scaffolds (Supplementary Table S3) that
compose the ISA1307 genome, this being compatible with the 13 chromosomal bands obtained in the PFGE
analysis (Fig. 1B). The structure of the centromere sequences obtained is similar to that described for
point centromeres of hemiascomycetous yeasts: two conserved domains, CDE I and CDE III, separated by an
AT-rich CDE II domain (AT content ranging from 69 to 82%) (Supplementary Table S3). Although the
assembly and annotation described above indicate the existence of large duplicated genomic regions in
the genome of the ISA1307 strain, only two or three of the 13 chromosomal bands obtained in the PFGE gel
seem to be duplicated (Fig. 1; see above). It is not possible to fully elucidate the structure of the
ISA1307 genome solely with the data available; however, the results of genome sequencing and karyotyping
(Fig. 1) suggest that the genome of this hybrid strain includes pairs of highly similar homeologous
chromosomes (presumably corresponding to the duplicated bands observed in the PFGE gel) and pairs of
more dissimilar homeologous chromosomes (presumably corresponding to the different-sized single bands
observed in the PFGE gel). Like ISA1307, other yeast hybrid strains have also been demonstrated to have
complex genome structures.
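
As an illustration of the AT-richness criterion mentioned for the CDE II domain, the sketch below computes the AT content of a candidate sequence. The example sequence is hypothetical, and using the 69-82% range observed in this genome as a filter is an assumption made here for illustration only, not a published detection rule.

```python
def at_content(sequence):
    """Percentage of A and T among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for base in bases if base in "AT")
    return 100.0 * at_count / len(bases)


def within_observed_cde2_range(sequence, low=69.0, high=82.0):
    """True if the AT content falls in the range reported for ISA1307 CDE II domains."""
    return low <= at_content(sequence) <= high


# Hypothetical 20-base stretch, not taken from Supplementary Table S3.
candidate = "ATTTAAATGCATTAAATTCG"
print(f"AT content: {at_content(candidate):.1f}%")
print("within observed range:", within_observed_cde2_range(candidate))
```

A full centromere search would also need to match the conserved CDE I and CDE III motifs flanking the AT-rich stretch, which this sketch does not attempt.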

3.5 Origin of ISA! 307 predicted proteins 

The vast majority of the proteins predicted to be encoded by
the genome of the ISA1307 strain have their best homologue
among proteins found in yeast species phylogenetically close
to Z. bailii, namely Z. rouxii, Torulaspora delbrueckii,
S. cerevisiae or other yeasts of the Saccharomycetaceae
family (results not shown). However, it was possible to
identify in the predicted proteome of the ISA1307 strain at
least 42 proteins that share a high degree of similarity with
proteins found in species distant from the Saccharomycetaceae
family (e.g. Candida tenuis, Hansenula polymorpha,
Schizosaccharomyces pombe) or even in moulds (e.g.
Aspergillus niger, Penicillium digitatum or Fusarium
oxysporum) (Supplementary Table S4). Six ISA1307 predicted
proteins seem to have a bacterial origin, since their closest
homologues are proteins found in Burkholderia cenocepacia,
Burkholderia terrae or Dickeya dadantii (Supplementary
Table S4). The occurrence of prokaryote-to-eukaryote and
eukaryote-to-eukaryote gene transfers has been demonstrated
in S. cerevisiae and in several other fungi. The physiological
functions of the proteins that seem to have been acquired by
ISA1307 through gene transfer are diverse, including a
putative Cu,Zn-superoxide dismutase, putative transporters
involved in the uptake of monocarboxylates, amino acids and
urea, two permeases similar to multi-drug resistance (MDR)
transporters of the Major Facilitator Superfamily (MFS), one
enzyme required for catabolism of mannose and one enzyme
required for metabolism of 1-aminocyclopropane-1-carboxylate,
an intermediate in the biosynthesis of the plant hormone
ethylene (Supplementary Table S4). Extensive genomic analysis
has demonstrated that the acquisition of novel genes by
fungi, whether from other fungi or from prokaryotes, is often
associated with increased cellular fitness for proliferation
in the corresponding ecological niche. Remarkably, 17 of the
proteins that seem to have been acquired by the ISA1307
strain do not have an orthologue in Z. bailii CLIB213T
(Supplementary Table S4), suggesting that they might have
been acquired after the hybridization process.
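
The best-homologue reasoning used in this section can be illustrated with a minimal sketch. It assumes that each predicted protein has already been paired with the species of its closest homologue (for instance from a similarity search); the protein identifiers, the `GROUP_OF_GENUS` lookup and the function name are hypothetical and serve only to show how candidates for gene transfer might be flagged. It is not the pipeline used in this work.

```python
# Hypothetical lookup from the genus of the best-hit species to a broad group.
# The groups follow the distinctions drawn in the text; the table is illustrative.
GROUP_OF_GENUS = {
    "Zygosaccharomyces": "Saccharomycetaceae",
    "Torulaspora": "Saccharomycetaceae",
    "Saccharomyces": "Saccharomycetaceae",
    "Candida": "distant yeast",
    "Hansenula": "distant yeast",
    "Schizosaccharomyces": "distant yeast",
    "Aspergillus": "mould",
    "Penicillium": "mould",
    "Fusarium": "mould",
    "Burkholderia": "bacterium",
    "Dickeya": "bacterium",
}


def classify_best_hits(best_hits):
    """Map each protein id to the broad group of its best-hit species.

    best_hits: dict of protein id -> species name of the closest homologue.
    The genus is taken as the first word of the species name.
    """
    groups = {}
    for protein_id, species in best_hits.items():
        genus = species.split()[0]
        groups[protein_id] = GROUP_OF_GENUS.get(genus, "unclassified")
    return groups


def transfer_candidates(best_hits):
    """Return the ids whose best hit lies outside the Saccharomycetaceae."""
    groups = classify_best_hits(best_hits)
    return sorted(p for p, g in groups.items() if g != "Saccharomycetaceae")


# Toy input with made-up protein identifiers (not real ISA1307 ORFs).
toy_hits = {
    "prot_A": "Zygosaccharomyces rouxii",
    "prot_B": "Aspergillus niger",
    "prot_C": "Burkholderia terrae",
}
print(classify_best_hits(toy_hits))
print(transfer_candidates(toy_hits))
```

A best hit outside the family is only a first indication; confirming a transfer event would normally require phylogenetic analysis of each candidate.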

3.6. Functional categorization of ISA1307 genes

The 9931 gene loci predicted in the genome of the ISA1307
strain were clustered according to their physiological
function using the FunCatDB functional catalogue (Fig. 3).
The highest numbers of genes were found in the functional
classes 'Metabolism and generation of energy' (35% of the
total of predicted genes), 'Protein folding, modification and
targeting' (25% of the predicted genes) and 'Biogenesis of
cellular components' (21% of the predicted genes) (Fig. 3).
The functional categorization of the ISA1307 genome is, in
general, similar to the one obtained for the S. cerevisiae
S288c or Z. rouxii CBS732 genomes (Supplementary Fig. S1).
Genes encoding transposable elements were found to be very
scarce in the ISA1307 genome (Supplementary Fig. S1). The
most abundant motifs found in the proteins predicted to be
encoded by the ISA1307 genome were: (i) the WD40/YVTN motif,
present in signal-transducing G-proteins or in
actin-interacting proteins, (ii) kinase-associated motifs,
(iii) motifs present in NADP+-binding enzymes, (iv) the
armadillo motif, found in protein phosphatases and in
translation initiation factors and (v) signature motifs of
transporters of the MFS (Fig. 4). These were also the most
abundant motifs found in the S. cerevisiae S288c or Z. rouxii
CBS732 proteomes (results not shown).

3.7. ISA1307 genes involved in metabolism and
transport of carbohydrates

Genes encoding enzymes involved in all major pathways of
central carbon metabolism were found in the ISA1307 ORFeome,
including enzymes of the glycolytic pathway, TCA cycle,
gluconeogenesis and pentose phosphate pathway, and the
anaplerotic enzymes isocitrate lyase, pyruvate carboxylase,
phosphoenolpyruvate carboxykinase and malic enzyme
(Supplementary Fig. S2 and Table S5). Based on the genome
annotation, it is anticipated that the respiratory chain of
ISA1307 cells includes two mitochondrial NADH dehydrogenases
(one located in the inner mitochondrial membrane and another
in the outer membrane), one FADH2:fumarate dehydrogenase
(Complex II), one cytochrome c:ubiquinone reductase
(Complex III) and one cytochrome c oxidase (Complex IV)
(Supplementary Fig. S2 and Table S5). The hybrid strain
ISA1307 does not seem to have a functional Complex I, like
all the other described yeasts able to perform aerobic
alcoholic fermentation, nor does it have alternative oxidases
to perform cyanide-resistant respiration. This organization
supports the idea that the hybrid strain ISA1307 obtains
energy from the respiratory chain through the proton gradient
generated by Complexes III and IV, which is consistent with
previous studies demonstrating the high sensitivity exhibited
by Z. bailii strains to the cytochrome c reductase inhibitor
antimycin. Genes encoding enzymes required for the catabolism
of galactose, glycerol, acetate, ethanol and fructose were
also identified in the genome of the ISA1307 strain
(Supplementary Fig. S2 and Table S5), consistent with the
described ability of this strain to use all these carbon
sources. Genes encoding enzymes required for catabolism of
xylose, sorbose, sorbitol, inulin and glucose-based
polysaccharides were also found in the genome of the ISA1307
strain (Supplementary Fig. S2 and Table S5). A gene encoding
a putative lactate dehydrogenase (LDH) (ZBAI_09900) was also
found, suggesting that ISA1307 cells may be able to perform
lactic fermentation as an alternative to alcoholic
fermentation. Interestingly, we could not find in the genome
of Z. bailii CLIB213T an orthologue for this putative LDH
enzyme found in the genome of the ISA1307 strain
(Supplementary Table S5), indicating that it may have been
acquired from the other parental species of the ISA1307
strain.

Figure 3. Functional classes of genes predicted to be encoded by the genome of the ISA1307 strain. The genes predicted by the annotation of
the genome of the ISA1307 strain (detailed in Section 2) were clustered according to their biological function using the FunCatDB. The
number of genes included in each functional category is indicated.

Nine putative hexose transporters similar to the
well-characterized S. cerevisiae Hxt transporters are
included in the predicted 'transportome' of the ISA1307
strain, as well as two transporters of the sugar porter
family, one transporter similar to the Kluyveromyces lactis
glucose/fructose/galactose transporter Hgt1 and several
predicted hexose-like transporters of uncharacterized
function (Supplementary Table S5). Fructophilicity, one of
the main physiological characteristics that distinguishes the
Z. bailii species, is retained in the ISA1307 strain. The
activity of the highly specific fructose transporter Ffz1 and
the repression of glucose transport by the presence of
fructose are considered to be the basis of the
fructophilicity of the ISA1307 strain. Besides Ffz1 (ORF
ZBAI_03578), three other genes encoding transporters highly
similar to Ffz1 were found in the ISA1307 genome
(Supplementary Table S5). Three of the four Ffz1-like genes
found in the genome of ISA1307 were also present in the
genome of Z. bailii CLIB213T (Supplementary Table S5).
Interestingly, one sugar transporter (ZBAI_01802) that is
present in S. cerevisiae wine strains but absent in the
laboratory strain S288c (Supplementary Table S4) was found in
the ISA1307 genome. Two homologues of the S. cerevisiae gene
ADY2, encoding an acetate transporter, and two putative
glycerol permeases were also found to be present in the
ISA1307 genome (Supplementary Table S4).

Table 3. Conservation of the Snf1-signalling pathway in S. cerevisiae, in Z. bailii CLIB213T and in the interspecies hybrid strain ISA1307

S. cerevisiae gene | Function in glucose repression pathway | ISA1307 homologue ORF | Z. bailii CLIB213T homologue ORF
Snf1 pathway | | |
SAK1 | Activates Snf1 kinase by phosphorylation in glucose starvation or non-fermentable carbon sources | ZBAI_08236 | BN860_06128g
SNF1 | Kinase that is activated in response to low glucose concentrations or the presence of non-fermentable carbon; inactivates Mig1 by phosphorylation | ZBAI_02162/ZBAI_08016 | BN860_10132g
SIP1 | Regulatory subunit of Snf1 involved in response to low and high external glucose concentrations | ZBAI_03741 | BN860_03840g
SIP2, GAL83 | Regulatory subunits of Snf1 that are required for activation of the kinase in response to non-fermentable carbon sources | ZBAI_06706/ZBAI_04665 | BN860_04170g
SNF4 | Activating subunit of Snf1; activates glucose-repressed genes and represses glucose-induced genes | ZBAI_01886/ZBAI_06368 | BN860_12662g
MIG1, MIG2, MIG3 | Mig1: transcriptional repressor of low affinity hexose transporters and of transcription factors Cat8, Hap4 and Adr1 involved in response to non-fermentable carbon sources. Mig2: co-operates with Mig1 in glucose repression. Mig3: transcriptional regulator required for glucose repression in wild-type S. cerevisiae isolates; inactivated in the laboratory strain S288c | ZBAI_06392/ZBAI_06707, ZBAI_00188 | BN860_12046g, BN860_04148g

Proteins from ISA1307 and from Z. bailii CLIB213T homologous to the S. cerevisiae proteins described to belong to the Snf1-signalling pathway. The physiological function of the S. cerevisiae proteins is based on the information available at the Saccharomyces Genome Database.



3.8. Proteins involved in Crabtree effect regulation 

The ISA1307 strain and other strains belonging to the
Z. bailii species are known to have an alleviated Crabtree
effect, being able to co-consume glucose and other carbon
sources. The genome sequences of the ISA1307 hybrid strain
and of the type strain Z. bailii CLIB213T were searched for
homologues of the Snf1-signalling pathway, known to play a
prominent role in glucose repression in S. cerevisiae
(Table 3). No significant differences were registered in the
amino acid sequences of the proteins predicted to function in
the Snf1-signalling pathway in ISA1307 and in Z. bailii
CLIB213T, indicating that this pathway should function in a
similar manner in the two strains (results not shown).
However, the organization of the Snf1 pathway in the ISA1307
strain and in Z. bailii CLIB213T is apparently different from
the one described in S. cerevisiae, since the regulatory
subunits Gal83 and Sip2 are apparently fused into a single
protein (with similarity to the protein domains found in the
two independent S. cerevisiae proteins) and only two Mig
transcription factors are encoded by the genomes of the
Zygosaccharomyces strains (Table 3). The homology between
the three S. cerevisiae Mig transcription factors and the two
putative Mig-like transcription factors found in Z. bailii
CLIB213T or in the ISA1307 strain was limited to the
DNA-binding domain, suggesting that all these transcription
factors may recognize similar DNA-binding sites, as found in
other fungi. Interestingly, the promoter regions of the
ISA1307 genes predicted to encode gluconeogenic enzymes,
enzymes of the TCA cycle or enzymes required for acetate or
glyoxylate metabolism, all subject to glucose repression in
S. cerevisiae in a Mig1-dependent manner, harbour DNA motifs
similar to the binding site described for ScMig1 (results not
shown). Although a significant difference was registered at
the level of the transactivation domains of the two ZbMig and
the three ScMig transcription factors, it is not possible at
this stage, from analysis of the genome sequence alone, to
uncover the mechanisms underlying the different behaviour of
the ISA1307 strain and of Z. bailii strains, compared with
S. cerevisiae, concerning the Crabtree effect. The
alleviation of the Crabtree effect was suggested to be behind
the high intrinsic resistance of the ISA1307 strain, and of
the Z. bailii species in general, to acetic acid and to other
weak acids used as food preservatives. However, since these
compounds are very diverse in structure, it is unlikely that
the high resistance of Z. bailii or of the ISA1307 strain to
all these weak acids results from the co-catabolism of all
these compounds.

3.9. Genes involved in transport and metabolism of 
amino acids and other nitrogen compounds 

The genome of the ISA1307 strain encodes enzymes required for
biosynthesis and catabolism of all proteinogenic amino acids
(Supplementary Fig. S3). Around 40% of the ISA1307 genes
included in the 'Metabolism' and 'Cellular transport'
functional classes (Fig. 3) encode proteins related to amino
acid metabolism or uptake. Remarkably, there are 18 predicted
pyruvate decarboxylases (PDCs) in the genome of the ISA1307
strain, while in Z. rouxii CBS732 and S. cerevisiae S288c
there are only three and five proteins, respectively,
annotated with this function (Supplementary Table S6). In the
type strain Z. bailii CLIB213T, there are five annotated
genes encoding PDC enzymes (Supplementary Table S5),
suggesting that the increase in the number of these genes is
a particular characteristic of the ISA1307 strain. PDC
enzymes are involved in alcoholic fermentation (by catalysing
the conversion of pyruvate to acetaldehyde) and in catabolism
of branched and aromatic amino acids through the Ehrlich
pathway. The amplification of PDC genes in the genome of the
ISA1307 strain does not favour alcoholic fermentation, since
the alcoholic fermentation rate of these cells is below the
rates exhibited by Z. rouxii or S. cerevisiae cells; however,
it may represent an adaptive response to the significant
amounts of aromatic and branched amino acids (the main
substrates of the Ehrlich pathway) that are found in wines,
the ecological niche from which this hybrid strain was
isolated. Thirty-six ISA1307 genes are predicted to encode
amino acid permeases, including general amino acid permeases
and permeases specific for proline, histidine, lysine,
arginine, methionine, branched amino acids (valine,
isoleucine and leucine) and neutral amino acids
(Supplementary Table S6). Genes required for catabolism of
allantoin, urea and the non-proteinogenic amino acid GABA, as
well as genes encoding permeases for these nitrogen sources,
were also found in the predicted set of ISA1307 proteins
(Supplementary Fig. S3). Interestingly, some of the
permease-encoding genes found in the ISA1307 genome have
homologues in S. cerevisiae strains isolated from wines, but
not in the laboratory strain S288c (Supplementary Table S6).
The comparison of the genomes of several S. cerevisiae wine
strains with the genome of the laboratory strain S288c
strongly suggests that the acquisition of genes required for
transport and metabolism of nitrogen sources results from
adaptation to the nitrogen-depleted environment of wine
musts.

3.10. Genes involved in meiosis and mating

Infertility is a common characteristic of hybrid yeast
strains due to the incompatibility of genes coming from the
parental genomes, gross chromosomal rearrangements and
abnormal gene segregation, among other factors. As expected
from a hybrid strain, ISA1307 cells were only found to
propagate by clonal expansion through mitotic divisions.
Other strains previously classified as Z. bailii (NCYC563,
NCYC1427, NCYC1416 and NCYC128) were also found to be unable
to produce meiotic spores; however, it remains to be
established if these strains do belong to the Z. bailii
species (this was not examined in the cited study) or if they
are hybrid strains or strains belonging to a closely related
species. Based on the infertility phenotype exhibited by the
ISA1307 strain, it was generally accepted that Z. bailii
cells were unable to undergo meiosis. The predicted proteomes
of the ISA1307 strain and of the type strain Z. bailii
CLIB213T, whose ability to undergo meiosis is, as far as we
know, unknown, were searched for proteins homologous to those
described to be required for functional meiosis and mating in
S. cerevisiae (Supplementary Table S7). A number of proteins
demonstrated to play an essential role in meiosis in the
budding yeast are apparently not encoded by the genome of the
ISA1307 strain, including: (i) Ime1 and Ume6, key
transcriptional regulators of meiosis-related genes, (ii)
Emi1, required for transcriptional induction of Ime1, (iii)
Zip2 and Cst9, involved in the formation of the synaptonemal
complex, (iv) Don1, Mpc54, Spo16, Spo20, Spo21, Spo22 and
Spo74, involved in the formation of the meiotic plate and (v)
Rec104, Zip2, Mlh2, Msh4 and Msh5, required for the induction
of meiotic recombination (Supplementary Table S7). Concerning
the molecular machinery required for mating in S. cerevisiae,
the transcription factors Dig1 and Dig2, required for the
regulation of mating-specific genes and for triggering the
invasive growth pathway, also seem to be absent from the
ISA1307 genome (Supplementary Table S6). Moreover, neither
the S. cerevisiae a or α mating cassettes (encoded in the
HMRA1/HMRA2 and HMALPHA1/HMALPHA2 loci) nor the corresponding
a or α mating factors [encoded by the MATa1/MATa2 and
MAT(alpha1)/MAT(alpha2) genes] were found to have homologues
in the ISA1307 genome (Supplementary Table S6). Around half
of the genes required for meiosis in the budding yeast that
are missing in the genome of the ISA1307 strain were found in
the genome sequence available for Z. bailii CLIB213T;
however, this strain lacks proteins with a very prominent
role in the meiotic process, such as Ime1 (Supplementary
Table S7). Homologues to the S. cerevisiae a and α mating
cassettes were also not found in the genome of Z. bailii
CLIB213T (Supplementary Table S7). Based solely on the
inspection of the genome sequence, it is not possible to say
if the infertility phenotype of ISA1307 cells derives from
being a hybrid strain or if this trait was already present in
the parental species, in particular in Z. bailii.

3.11. ISA1307 genes involved in stress response

One of the goals underlying species hybridization is the
increase in cell robustness. Indeed, hybrid yeast strains
isolated from the harsh environmental conditions of wine
fermentations were found to be more resistant to stress than
their parental species. One of the main phenotypic traits of
the ISA1307 strain is its high tolerance to acetic acid
stress. Zygosaccharomyces bailii is known for being resistant
to stress induced by several weak acid food preservatives and
tolerant to several sanitizers and to osmotic stress induced
by high sugar concentrations. Although this high intrinsic
resilience of Z. bailii cells to the above-mentioned stresses
is believed to underlie the high spoilage capacity of this
yeast species, especially in acidic foods and drinks, the
molecular mechanisms behind this trait are still unknown or
unclear. With this in mind, the genome of the ISA1307 strain
was searched for proteins described to play a role in the
S. cerevisiae response and resistance to weak acid food
preservatives, in particular to acetic acid. The genome of
the ISA1307 strain encodes one protein, encoded by the
paralogous genes ZBAI_03527 and ZBAI_08525, homologous to the
S. cerevisiae transcription factors Msn2 and Msn4, which
control the transcriptional response to environmental stress,
in particular the response to weak acid food preservatives.
Apparently, the Z. bailii CLIB213T genome also encodes one
protein (ZYBA0S17-00848g1_1) with similarity to the
ScMsn2/ScMsn4 transcription factors. The highest degree of
similarity of this putative ZbMsn2/4 with ScMsn2 or ScMsn4 is
registered at the level of the DNA-binding domain, mapped in
the C-terminal region of these proteins. Most of the genes
involved in the S. cerevisiae Environmental Stress Response
(ESR) are conserved in the genome of the ISA1307 strain and,
in general, the promoter regions of these putative
stress-responsive genes harbour the STRE motif (5'-CCCCT-3',
results not shown) for ScMsn2/ScMsn4 binding. Approximately
98% of the genes that were found to mediate MDR in
S. cerevisiae are conserved in the ISA1307 and Z. bailii
CLIB213T genomes (Supplementary Table S8), suggesting that
some of the mechanisms that were described to underlie the
MDR phenomenon in the budding yeast, namely plasma membrane
lipid composition, intracellular protein trafficking mediated
by vesicular transport or proteasomal activity, may also be
active in Z. bailii. The genome of the ISA1307 strain encodes
at least 63 MDR transporters of the ABC superfamily (28) and
of the MFS (35) (Supplementary Table S9). The role of a
number of these transporters in S. cerevisiae MDR has been
well documented, in particular the MFS transporters Azr1,
Aqr1, Tpo2 and Tpo3, and the ABC transporter Pdr12, described
as determinants of S. cerevisiae resistance to weak acid food
preservatives (Supplementary Table S9). Four non-paralogous
ISA1307 genes are predicted to encode Pdr12-like proteins;
this apparent PDR12 amplification is an interesting
observation considering the major role attributed to this
protein in the S. cerevisiae response and resistance to weak
acid-induced stress. Six MFS-MDR transporters of
uncharacterized function that do not appear to have
homologues either in the sequenced S. cerevisiae strains or
in Z. rouxii CBS732 were also found to be encoded by the
genome of the ISA1307 strain (Supplementary Table S9). Two of
these transporters (encoded by ZBAI_07578 and by the
paralogous genes ZBAI_00386/ZBAI_01804) have very high
homology to MFS-MDR transporters from Candida dubliniensis
and Aspergillus fumigatus, suggesting that these genes could
have been acquired by gene transfer (Supplementary Table S9).
Interestingly, we could not identify in the genome of
Z. bailii CLIB213T orthologues for these putative eight
MFS-MDR transporters nor for the four Pdr12-like genes that
were found in the genome of the ISA1307 strain (Supplementary
Table S9). Therefore, it is hypothesized that these genes
could be encoded by the non-Z. bailii species genome of the
hybrid strain ISA1307. The vast majority of the genes that
mediate S. cerevisiae tolerance to acetic, propionic and
sorbic acids (90-95%, depending on the weak acid) were also
found to be conserved in the ISA1307 hybrid strain and
Z. bailii CLIB213T genomes (Supplementary Table S10). Among
these conserved genes are the key regulators of the
S. cerevisiae response to weak acid stress, Haa1, War1 and
Rim101. Most of the genes of the Haa1-, Rim101- or
War1-regulons that were described in S. cerevisiae were also
found in the ISA1307 predicted proteome (Supplementary
Table S10), suggesting that these signalling pathways could
also be active and play a role in the intrinsic high
resistance of this strain and of the Z. bailii species to
weak acid food preservatives and, in particular, to acetic
acid. Although the stress signalling pathways described for
S. cerevisiae are well conserved in other fungi, there is
evidence for a rapid adaptive evolution of these regulatory
pathways under the environmental challenges to which they are
exposed in the different ecological niches. The knowledge of
the genome sequence of the ISA1307 interspecies hybrid strain
opens the door to the in silico and in vivo genome-wide
identification of genes and pathways involved in stress
resistance in Z. bailii and in this Z. bailii-derived hybrid
strain, in particular of those genes relevant for yeast
protection against stresses characteristic of the wine
environment.
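
The search for the STRE motif (5'-CCCCT-3') in promoter regions mentioned above can be illustrated with a minimal sketch. It assumes that promoter sequences are available as plain DNA strings and reports occurrences on both strands; the function names and the toy sequence are hypothetical and do not reproduce the analysis performed in this work.

```python
import re

STRE = "CCCCT"  # core STRE motif as quoted in the text (5'-CCCCT-3')


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(promoter, motif=STRE):
    """Return sorted (0-based start, strand) pairs for every motif occurrence.

    The '-' strand is searched by looking for the reverse complement of the
    motif in the given sequence. A lookahead pattern is used so that
    overlapping occurrences are all reported.
    """
    seq = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        for match in re.finditer("(?=" + pattern + ")", seq):
            hits.append((match.start(), strand))
    return sorted(hits)


# Toy promoter fragment (made up for illustration, not an ISA1307 sequence).
toy_promoter = "ATGCCCCTTTAGGGGACGT"
print(find_motif(toy_promoter))  # [(3, '+'), (10, '-')]
```

Because the core motif is only five bases long, matches are expected by chance in any long sequence, so counts of this kind are usually interpreted against a background expectation.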

Supplementary data: Supplementary Data are 
available at www.dnaresearch.oxfordjournals.org. 